Skip to content

[InstCombine] Rewrite multi-use GEPs when simplifying comparison #146100

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jul 1, 2025

Conversation

nikic
Copy link
Contributor

@nikic nikic commented Jun 27, 2025

We already do this when both sides are a GEP, but not if only one is. This ensures that the offset arithmetic is not duplicated.

@nikic nikic requested a review from dtcxzyw June 27, 2025 15:53
@llvmbot llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms labels Jun 27, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 27, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Changes

We already do this when both sides are a GEP, but not if only one is. This ensures that the offset arithmetic is not duplicated.


Full diff: https://github.com/llvm/llvm-project/pull/146100.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/icmp-gep.ll (+28)
  • (modified) llvm/test/Transforms/PhaseOrdering/loop-access-checks.ll (+2-1)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 0894ca92086f3..6de1f8558e8cd 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -711,7 +711,7 @@ Instruction *InstCombinerImpl::foldGEPICmp(GEPOperator *GEPLHS, Value *RHS,
   Value *PtrBase = GEPLHS->getOperand(0);
   if (PtrBase == RHS && CanFold(GEPLHS->getNoWrapFlags())) {
     // ((gep Ptr, OFFSET) cmp Ptr)   ---> (OFFSET cmp 0).
-    Value *Offset = EmitGEPOffset(GEPLHS);
+    Value *Offset = EmitGEPOffset(GEPLHS, /*RewriteGEP=*/true);
     return NewICmp(GEPLHS->getNoWrapFlags(), Offset,
                    Constant::getNullValue(Offset->getType()));
   }
diff --git a/llvm/test/Transforms/InstCombine/icmp-gep.ll b/llvm/test/Transforms/InstCombine/icmp-gep.ll
index 260462896c39d..1e48a38803fdb 100644
--- a/llvm/test/Transforms/InstCombine/icmp-gep.ll
+++ b/llvm/test/Transforms/InstCombine/icmp-gep.ll
@@ -221,6 +221,34 @@ define i1 @eq_base_inbounds_commute_use(i64 %y) {
   ret i1 %r
 }
 
+define i1 @ne_base_inbounds_use_scaled(ptr %x, i64 %y) {
+; CHECK-LABEL: @ne_base_inbounds_use_scaled(
+; CHECK-NEXT:    [[G_IDX:%.*]] = shl nsw i64 [[Y:%.*]], 3
+; CHECK-NEXT:    [[G:%.*]] = getelementptr inbounds i8, ptr [[X:%.*]], i64 [[G_IDX]]
+; CHECK-NEXT:    call void @use(ptr [[G]])
+; CHECK-NEXT:    [[R:%.*]] = icmp ne i64 [[Y]], 0
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %g = getelementptr inbounds i64, ptr %x, i64 %y
+  call void @use(ptr %g)
+  %r = icmp ne ptr %g, %x
+  ret i1 %r
+}
+
+define i1 @ne_base_use_scaled(ptr %x, i64 %y) {
+; CHECK-LABEL: @ne_base_use_scaled(
+; CHECK-NEXT:    [[G_IDX_MASK:%.*]] = shl i64 [[Y:%.*]], 3
+; CHECK-NEXT:    [[G:%.*]] = getelementptr i8, ptr [[X:%.*]], i64 [[G_IDX_MASK]]
+; CHECK-NEXT:    call void @use(ptr [[G]])
+; CHECK-NEXT:    [[R:%.*]] = icmp ne i64 [[G_IDX_MASK]], 0
+; CHECK-NEXT:    ret i1 [[R]]
+;
+  %g = getelementptr i64, ptr %x, i64 %y
+  call void @use(ptr %g)
+  %r = icmp ne ptr %g, %x
+  ret i1 %r
+}
+
 define i1 @eq_bitcast_base(ptr %p, i64 %x) {
 ; CHECK-LABEL: @eq_bitcast_base(
 ; CHECK-NEXT:    [[GEP_IDX_MASK:%.*]] = and i64 [[X:%.*]], 9223372036854775807
diff --git a/llvm/test/Transforms/PhaseOrdering/loop-access-checks.ll b/llvm/test/Transforms/PhaseOrdering/loop-access-checks.ll
index e53b8687af915..45f18dd567396 100644
--- a/llvm/test/Transforms/PhaseOrdering/loop-access-checks.ll
+++ b/llvm/test/Transforms/PhaseOrdering/loop-access-checks.ll
@@ -24,7 +24,8 @@ define void @test_fill_with_foreach([2 x i64] %elems.coerce) {
 ; CHECK-NEXT:    [[ELEMS_COERCE_FCA_0_EXTRACT:%.*]] = extractvalue [2 x i64] [[ELEMS_COERCE]], 0
 ; CHECK-NEXT:    [[TMP0:%.*]] = inttoptr i64 [[ELEMS_COERCE_FCA_0_EXTRACT]] to ptr
 ; CHECK-NEXT:    [[ELEMS_COERCE_FCA_1_EXTRACT:%.*]] = extractvalue [2 x i64] [[ELEMS_COERCE]], 1
-; CHECK-NEXT:    [[ADD_PTR_I:%.*]] = getelementptr inbounds i32, ptr [[TMP0]], i64 [[ELEMS_COERCE_FCA_1_EXTRACT]]
+; CHECK-NEXT:    [[ADD_PTR_I_IDX:%.*]] = shl nsw i64 [[ELEMS_COERCE_FCA_1_EXTRACT]], 2
+; CHECK-NEXT:    [[ADD_PTR_I:%.*]] = getelementptr inbounds i8, ptr [[TMP0]], i64 [[ADD_PTR_I_IDX]]
 ; CHECK-NEXT:    [[CMP_NOT_I_I_I_I:%.*]] = icmp slt i64 [[ELEMS_COERCE_FCA_1_EXTRACT]], 0
 ; CHECK-NEXT:    br i1 [[CMP_NOT_I_I_I_I]], label [[ERROR:%.*]], label [[FOR_COND_PREHEADER_SPLIT:%.*]]
 ; CHECK:       for.cond.preheader.split:

Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. The only downside is that rewriting of GEPs may lose nuw flags when it is also used by a nuw pointer difference. This pattern is introduced by Rust's intrinsic ptr_offset_from_unsigned. https://doc.rust-lang.org/src/core/ptr/const_ptr.rs.html#772 https://github.com/rust-lang/rust/blob/d41e12f1f4e4884c356f319b881921aa37040de5/compiler/rustc_codegen_ssa/src/mir/intrinsic.rs#L497-L500
https://alive2.llvm.org/ce/z/ALxH4R

@nikic
Copy link
Contributor Author

nikic commented Jun 28, 2025

This has an interesting impact on clang stage2: https://llvm-compile-time-tracker.com/compare.php?from=c503c5e26baf942773379e1b445e6730066005a9&to=a9122c1e6a652860a7636e15ae5e227538161f47&stat=instructions:u Most likely perturbing an inlining heuristic in a way that happens to be favorable.

@nikic
Copy link
Contributor Author

nikic commented Jul 1, 2025

This has an interesting impact on clang stage2: https://llvm-compile-time-tracker.com/compare.php?from=c503c5e26baf942773379e1b445e6730066005a9&to=a9122c1e6a652860a7636e15ae5e227538161f47&stat=instructions:u Most likely perturbing an inlining heuristic in a way that happens to be favorable.

I've investigated this, and it turned out to be runtime unrolling rather than inlining, in a fairly peculiar way. The relevant method is DenseMap::initEmpty():

void initEmpty() {
setNumEntries(0);
setNumTombstones(0);
assert((getNumBuckets() & (getNumBuckets() - 1)) == 0 &&
"# initial buckets must be a power of two!");
const KeyT EmptyKey = getEmptyKey();
for (BucketT *B = getBuckets(), *E = getBucketsEnd(); B != E; ++B)
::new (&B->getFirst()) KeyT(EmptyKey);
}

The reason for the difference is that currently, runtime unrolling fails the high cost expansion check. But after this change, the offset for the last element is already calculated with a separate instruction, and thus does not count towards the expansion cost.

So I think the change there is not problematic.

@nikic nikic merged commit b8b7494 into llvm:main Jul 1, 2025
10 checks passed
@nikic nikic deleted the icmp-gep-rewrite-2 branch July 1, 2025 12:26
rlavaee pushed a commit to rlavaee/llvm-project that referenced this pull request Jul 1, 2025
…m#146100)

We already do this when both sides are a GEP, but not if only one is.
This ensures that the offset arithmetic is not duplicated.
rlavaee pushed a commit to rlavaee/llvm-project that referenced this pull request Jul 1, 2025
…m#146100)

We already do this when both sides are a GEP, but not if only one is.
This ensures that the offset arithmetic is not duplicated.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:transforms
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants